Plotting Your Data

GVPT399F: Power, Politics, and Data

Data visualization

We will use data visualization to answer the following question:

Do cars with big engines use more fuel than cars with small engines?
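
Before plotting, it can help to glance at the variables behind this question. The following is a minimal sketch, assuming the ggplot2 package is installed (it ships the mpg dataset). Here displ is engine displacement in litres and hwy is highway miles per gallon; the name engine_hwy_cor is only an illustrative choice.

```r
library(ggplot2)  # ships the mpg dataset

# displ: engine displacement in litres; hwy: highway miles per gallon
head(mpg[, c("displ", "hwy", "class")])

# A negative correlation would suggest that bigger engines
# travel fewer miles per gallon, i.e. use more fuel
engine_hwy_cor <- cor(mpg$displ, mpg$hwy)
engine_hwy_cor
```

A single number hides a lot of detail, which is why the rest of this section builds the plot step by step.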

Set up your plot

An empty canvas!

library(ggplot2)  # provides ggplot() and the mpg dataset
ggplot(data = mpg)

Map your aesthetics

ggplot(data = mpg, mapping = aes(x = displ, y = hwy))

Add in your cars

ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + 
  geom_point()

Look at the relationship across all cars

ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + 
  geom_point() + 
  geom_smooth(method = "lm")
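
The straight line drawn by geom_smooth(method = "lm") is a linear regression of hwy on displ. As a rough sketch of roughly what that layer fits, the same model can be estimated directly with lm(); the names fit_all and slope_all are hypothetical, chosen only for this example.

```r
library(ggplot2)  # ships the mpg dataset

# Fit highway miles per gallon as a linear function of engine displacement
fit_all <- lm(hwy ~ displ, data = mpg)

# The displ coefficient is the slope of the line across all cars:
# the expected change in hwy for each extra litre of displacement
slope_all <- coef(fit_all)[["displ"]]
slope_all
```

A negative slope matches the downward-sloping line in the plot: cars with bigger engines tend to get fewer highway miles per gallon.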

Let’s look at groups in the data

  • We can look at more than two interesting elements of our data.

  • You can use visual elements or aesthetics (aes) to communicate many dimensions in your data.

  • Let’s look at a categorical variable: the class of car (SUV, 2-seater, pickup truck, etc.).

  • Look for meaningfully defined groups.

Let’s look at groups in the data

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, colour = class)) + 
  geom_point()

Look at the relationship within groups

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, colour = class)) + 
  geom_point() + 
  geom_smooth(method = "lm")

Aesthetics can be isolated to a single layer

ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + 
  geom_point(aes(colour = class)) + 
  geom_smooth(method = "lm")
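
To see what isolating an aesthetic changes, one option is to count how many separate lines the smoothing layer draws in each version of the plot. This is a sketch rather than a required step; it assumes ggplot2's layer_data() helper, and the names p_global, p_isolated, n_lines_global, and n_lines_isolated are hypothetical.

```r
library(ggplot2)

# colour = class set in ggplot(): every layer inherits it
p_global <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy, colour = class)) +
  geom_point() +
  geom_smooth(method = "lm")

# colour = class set only inside geom_point(): the smooth layer ignores it
p_isolated <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(aes(colour = class)) +
  geom_smooth(method = "lm")

# Layer 2 is geom_smooth(); each distinct group gets its own fitted line
n_lines_global <- length(unique(layer_data(p_global, i = 2)$group))
n_lines_isolated <- length(unique(layer_data(p_isolated, i = 2)$group))

n_lines_global    # one line per class of car
n_lines_isolated  # a single line across all cars
```

Mapping colour inside geom_point() keeps the points coloured by class while the fitted line still summarises every car at once.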